Approximating an Interlingua in a Principled Way

Authors

  • Eduard H. Hovy
  • Sergei Nirenburg
Abstract

We address the problem of constructing in a principled way an ontology of terms to be used in an interlingua for machine translation. Given our belief that a true language-neutral ontology of terms can only be approached asymptotically, the construction method outlined involves a step-wise folding in of one language at a time. This is effected in three steps: first building for each language a taxonomy of the linguistic generalizations required to analyze and generate that language, then organizing the domain entities in terms of that taxonomy, and finally merging the result with the existing interlingua ontology in a well-defined way. This methodology is based not on intuitive grounds about what is and is not 'true' about the world, which is a question of language-independence, but instead on practical concerns, namely what information the analysis and generation programs require in order to perform their tasks, a question of language-neutrality. After each merging is complete, the resulting taxonomy contains, declaratively and explicitly represented, those distinctions required to control the analysis and generation of the linguistic phenomena. The paper is based on current work of the PANGLOSS MT project.

1. Interlinguas

This paper presents a method of constructing in a principled way an ontology of terms to be used as a source of terms in an interlingua for machine translation. The method involves taxonomizing relevant linguistic phenomena in each language and merging the resulting taxonomy with the interlingua ontology to produce an ontology that explicitly records the phenomena that must be handled by any parser or generator and is neutral with respect to the languages handled by the system.

1.1. What is an Interlingua?

In interlingual machine translation, the representational power of the interlingua is central to the success of the translation.
By interlingua we mean a notation used in MT systems to represent the propositional and pragmatic meaning of input texts; an interlingua text that represents the meaning of a source language text is produced by computational analysis, and is then used as input to a generation module which realizes this meaning in a target language text. An interlingua consists of the following three parts:

  • a collection of terms: the elements that represent individual meanings (of lexical items, pragmatic aspects, etc.); this collection is organized in a multiply interconnected semantic network;
  • notation: the syntax to which well-formed interlingua texts conform;
  • substrate: the knowledge representation system in which interlingua …
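
The step-wise folding in of one language at a time, as described in the abstract, can be illustrated with a small sketch. This is not the PANGLOSS procedure, which this excerpt does not spell out; it only assumes that each language contributes a taxonomy of terms with parent links, and that merging means taking the union of terms and links while recording which languages need each distinction. The class `Ontology`, the function `fold_in`, and the example terms (`noun`, `count-noun`, `classifier-noun`) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Ontology:
    """Hypothetical interlingua ontology: a multiply connected term network."""
    # term -> set of parent terms (a term may have several parents)
    parents: Dict[str, Set[str]] = field(default_factory=dict)
    # term -> languages whose analysis or generation requires the term
    needed_by: Dict[str, Set[str]] = field(default_factory=dict)

    def add(self, term: str, language: str, parent: Optional[str] = None) -> None:
        """Register a term (and optionally one parent link) for a language."""
        self.parents.setdefault(term, set())
        self.needed_by.setdefault(term, set()).add(language)
        if parent is not None:
            self.parents.setdefault(parent, set())
            self.needed_by.setdefault(parent, set()).add(language)
            self.parents[term].add(parent)


def fold_in(interlingua: Ontology, language: str,
            taxonomy: Dict[str, Set[str]]) -> Ontology:
    """Merge one language's taxonomy (term -> parents) into the ontology.

    Assumption: merging is a plain union of terms and links; the real
    merging step may be considerably more selective.
    """
    for term, term_parents in taxonomy.items():
        interlingua.add(term, language)
        for parent in term_parents:
            interlingua.add(term, language, parent)
    return interlingua


# Fold in two toy taxonomies, one language at a time.
interlingua = Ontology()
fold_in(interlingua, "english", {"noun": set(), "count-noun": {"noun"}})
fold_in(interlingua, "japanese", {"noun": set(), "classifier-noun": {"noun"}})
```

After both calls, the shared term is marked as needed by both languages, while each language-specific distinction stays attributed to the language that introduced it. That attribution is one simple way to keep the distinctions "declaratively and explicitly represented".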


Similar resources

Ultimate approximation and its application in nonmonotonic knowledge representation systems

In this paper we study fixpoints of operators on lattices and bilattices in a systematic and principled way. The key concept is that of an approximating operator, a monotone operator on the product bilattice, which gives approximate information on the original operator in an intuitive and well-defined way. With any given approximating operator our theory associates several different types of fi...


Approximating the Distributions of Singular Quadratic Expressions and their Ratios

Noncentral indefinite quadratic expressions in possibly non-singular normal vectors are represented in terms of the difference of two positive definite quadratic forms and an independently distributed linear combination of standard normal random variables. This result also applies to quadratic forms in singular normal vectors for which no general representation is currently available. The ...


The Structure of Bhattacharyya Matrix in Natural Exponential Family and Its Role in Approximating the Variance of a Statistics

In most situations the best estimator of a function of the parameter exists, but sometimes it has a complex form and we cannot compute its variance explicitly. Therefore, a lower bound for the variance of an estimator is one of the fundamentals in the estimation theory, because it gives us an idea about the accuracy of an estimator. It is well-known in statistical inference that the Cramé...


The Role of Reversible Grammars in Translating Between Representation Languages

A capability for translating between representation languages is critical for effective knowledge base reuse. We describe a translation technology for knowledge representation languages based on the use of an interlingua for communicating knowledge. The interlingua-based translation process can be thought of as consisting of three major steps: (1) translation from the source language into a sub...


An interlingua based on domain actions for machine translation of task-oriented dialogues

This paper describes an interlingua for spoken language translation that is based on domain actions in the travel planning domain. Domain actions are composed of speech acts (e.g., requestinformation), attributes (e.g., size, price), and objects (e.g., hotel, flight) and can take arguments. Development of the interlingua is guided by a database containing travel dialogues in English, Korean, Ja...



Publication date: 1992